Abstract

A distributed security architecture is proposed for incorporation into group oriented distributed
systems and, in particular, into the Isis distributed programming toolkit. The primary goal of the
architecture is to preserve the Isis abstractions in hostile environments. These abstractions
include process groups and causal and atomic group multicast. Moreover, a delegation and access
control scheme is proposed for use in group oriented systems. The focus of the paper is the
security architecture; particular security protocols and cryptosystems are not emphasized.


1  Introduction 


Systems that address security issues in distributed environments have traditionally been constructed
upon the remote procedure call (RPC) paradigm of communication (e.g., [Bir85,SNS88,Sat89,TY91]).
Many systems, however, utilize more general types of communication which have not
enjoyed the same amount of attention from the security community. One such alternative is group
oriented communication, based on the process group abstraction [BCG91]. Process groups have
been incorporated into many distributed systems [OSS80,CZ85,BJ87,PBS89,LLS90,KT91] and have
been shown to facilitate the implementation of complex distributed algorithms and fault tolerant
applications [CZ85,BJ87,LLS90]. The benefit in preserving the process group abstraction in hostile
environments could therefore be great. In particular, it would facilitate the construction of securely
fault tolerant applications, i.e., fault tolerant applications which remain correct even when under
malicious attack.

*This work was supported by the Defense Advanced Research Projects Agency (DoD) under DARPA/NASA
subcontract NAG2-593 administered by the NASA Ames Research Center, by grants from GTE, IBM, and Siemens,
Inc., and by a National Science Foundation Graduate Fellowship. Any opinions, conclusions or recommendations
expressed in this document are those of the authors and do not necessarily reflect the views, policies or decisions of
the National Science Foundation or the Department of Defense.

†Visiting at Cornell University. Usual address: ORA Corporation, 675 Massachusetts Avenue, Cambridge,
Massachusetts 02139.

To illustrate, consider a stock brokerage that wishes to establish an online trading service at the
New York Stock Exchange (NYSE). Since the majority of the firm's trading at the NYSE will
be executed through this trading service, its availability and performance are crucial, and thus it
must be replicated. The firm's programmers therefore choose to implement the service as a fault
tolerant process group in their favorite group oriented programming environment. While the firm
can protect its own sites at the exchange from corruption, the firm's programmers cannot trust that
other sites, or the network by which the sites communicate, will behave as expected. Nevertheless,
interaction with other sites is necessary for efficient trading, and thus the group is forced to execute
in a potentially hostile environment. In particular, intruders (e.g., corrupt, competing traders)
may attempt to infiltrate the group, alter or forge group communication, or undermine the group
by tampering with the group abstractions on which the consistency and correctness of the service
rely. If the group oriented programming environment does not defend at least its own abstractions,
then the firm's programmers are faced with an unattractive choice: either they reimplement the
group oriented programming environment to include defenses against these forms of attack, or they
dismiss the process group approach and resort to other, possibly less favorable ones.

This paper presents a distributed security architecture to be integrated with group oriented systems
and, in particular, with the Isis toolkit [BJ87].¹ Isis is a toolkit for distributed programming which
provides process group and reliable group multicast abstractions. With respect to Isis, the aims
of this work are threefold. The first is to weaken the execution model assumed by Isis, so that
malicious behaviors are admitted, while still preserving the abstractions provided by Isis. This
change will enable programmers to rely upon the Isis abstractions even in hostile environments and
thus will facilitate the construction of fault tolerant services which remain correct even when under
malicious attack. The second is to enhance the Isis abstractions to be more suitable for use in a
hostile environment. The third is to accomplish the first two without unreasonably degrading the
performance of the toolkit.

¹More specifically, this security architecture is tailored to a reimplementation of the Isis toolkit called Horus,
named after the son of Isis in Egyptian mythology. In this paper, we will use Horus terminology, which may be
unfamiliar to users of earlier versions of Isis.

The  goal  of  this  paper  is  to  describe  the  major  features  of  our  security  architecture.  In  particular, 
we  do  not  discuss  specific  cryptosystems  or  security  protocols  in  detail.  We  instead  focus  on  the 
mechanisms  we  use  to  protect  the  abstractions  which  are  fundamental  in  Isis  and,  we  believe,  in  a 
group  oriented  setting. 

The  rest  of  this  paper  is  structured  as  follows.  We  begin  in  section  2  with  a  more  detailed  but 
informal  description  of  the  abstractions  provided  by  Isis.  Then,  in  section  3  we  discuss  the  system 
model  for  which  Isis  is  designed  and  the  model  we  consider  by  weakening  it  to  allow  malicious 
behaviors.  At  this  point  we  will  be  in  a  position  to  clarify  our  goals  and  to  enumerate  what  is 
needed to achieve them. Section 4 addresses this and proposes an architecture to achieve these
requirements.  Finally,  section  5  describes  a  delegation  and  access  control  scheme  for  use  in  group 
oriented  systems.  We  end  with  a  discussion  of  future  directions  of  research. 


2 The Isis Abstractions

The abstractions provided by Isis can be separated into two types, namely the process group and
virtual synchrony abstractions. A process group is simply a collection of processes with an associated
group address. Usually a process group is created for cooperation in a distributed task such as
replicating data, processing data in parallel, or providing fault tolerance, although Isis enforces
no restrictions on the purposes for which groups are formed. Groups may overlap arbitrarily, and
processes may create and join groups at any time. Moreover, a process may leave a group, either
by requesting to do so or by failing (i.e., crashing). A group $G$ can thus be seen as progressing
through a sequence $\mathit{view}_0(G), \mathit{view}_1(G), \ldots$ of views, where

1. $\mathit{view}_i(G) \subseteq P$, where $P$ is the set of all processes in the system,

2. $\mathit{view}_0(G) = \emptyset$, and

3. $\mathit{view}_i(G)$ and $\mathit{view}_{i+1}(G)$ differ by the addition or subtraction of exactly one process.
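
The three conditions on the view sequence can be checked mechanically. The following is a minimal sketch in Python, not part of Isis; the function name and the representation of processes as strings are assumptions made for illustration only.

```python
from typing import FrozenSet, Sequence


def is_valid_view_sequence(
    views: Sequence[FrozenSet[str]], all_processes: FrozenSet[str]
) -> bool:
    """Check the three conditions on view_0(G), view_1(G), ... (illustrative only)."""
    # Condition 2: the initial view is empty.
    if not views or views[0] != frozenset():
        return False
    # Condition 1: every view is a subset of the set P of all processes.
    if any(not view <= all_processes for view in views):
        return False
    # Condition 3: consecutive views differ by exactly one process, i.e. their
    # symmetric difference has size one (one join, one leave, or one failure).
    return all(len(prev ^ nxt) == 1 for prev, nxt in zip(views, views[1:]))
```

For example, the sequence {}, {p}, {p, q}, {q} satisfies all three conditions, whereas a sequence that jumps from {} directly to {p, q} violates the third.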

Members of a group learn about the membership of the group through certain events. More precisely,
execution of a process $p \in P$ is modeled as a sequence $e_p^0, e_p^1, \ldots$ of events, each corresponding
to the execution of an indivisible action. One such possible event is the delivery of the $i$'th group
view $\mathit{view}_i(G)$ of a process group $G$, denoted by $\mathit{view}_p(i, G)$. Views are delivered to processes in
sequential order, although $p$ observes $\mathit{view}_p(i+1, G)$ if and only if
$p \in \mathit{view}_i(G) \cup \mathit{view}_{i+1}(G)$; i.e., a process observes only those subsequences of
$\mathit{view}_0(G), \mathit{view}_1(G), \ldots$ which begin when it becomes a member and end when it is removed. If
$p \in \mathit{view}_i(G)$, then when $\mathit{view}_p(i, G)$ is observed at $p$, we say that $p$ is in the $i$'th group
view of $G$ until the event $\mathit{view}_p(i+1, G)$.

The primary means of communication in Isis is group multicast. A process in a group can multicast
to the group by specifying the group address as the destination.² The abstraction of virtual
synchrony consists of certain delivery guarantees regarding group multicasts. First, all destination
processes of the message are in the same group view when the message is delivered, and the set of
destination processes is precisely the members of that view. Second, all operational destinations
eventually deliver the message, or, and only if the sender fails, none do. Third, when multiple
destinations receive the same messages, they observe consistent delivery orders, in one of the
following two senses.

The first and least restrictive delivery ordering of interest is the causal delivery ordering, based
upon the potential causality relation defined in [Lam78]. To define this ordering, we introduce two
more types of events which can be executed or observed by a process. If $p$ is in some view of $G$,
then $e_p^i$ might be the event which multicasts a message $m$ to $G$, denoted $\mathit{mcast}_p(m, G)$, or which
delivers to $p$ a message $m$ multicast to $G$, denoted $\mathit{deliver}_p(m, G)$. The potential causality relation
$\rightarrow$ is defined as the irreflexive, transitive closure of the smallest relation satisfying

1. for all $i$ and $p$, $e_p^i \rightarrow e_p^{i+1}$, and

2. for all $m$, $p$, $q$, and $G$, $\mathit{mcast}_p(m, G) \rightarrow \mathit{deliver}_q(m, G)$.

Isis' causal delivery ordering property guarantees that if $\mathit{mcast}_p(m, G) \rightarrow \mathit{mcast}_q(m', G')$, then
at any common destination $r$, $\mathit{deliver}_r(m, G) \rightarrow \mathit{deliver}_r(m', G')$. In words, if the multicast of
message $m$ causally precedes the multicast of message $m'$, then $m$ is delivered before $m'$ at any
common destination. The multicast protocol which implements this property is called CBCAST.
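
To make the causal delivery property concrete, the sketch below shows one standard way a receiver can enforce it within a single group, using vector timestamps. This illustrates the property and is not a description of the CBCAST implementation. The class name, the message format, and the restriction to one group with fixed membership are all assumptions made here.

```python
from typing import Dict, List, Tuple


class CausalReceiver:
    """Delays messages until everything that causally precedes them is delivered."""

    def __init__(self, members: List[str]) -> None:
        # Number of multicasts from each member delivered so far at this process.
        self.delivered: Dict[str, int] = {m: 0 for m in members}
        self.pending: List[Tuple[str, Dict[str, int], str]] = []
        self.log: List[str] = []  # payloads in delivery order

    def _deliverable(self, sender: str, stamp: Dict[str, int]) -> bool:
        # The message must be the next one from its sender ...
        if stamp[sender] != self.delivered[sender] + 1:
            return False
        # ... and the sender must not have seen anything we have not delivered.
        return all(stamp[m] <= self.delivered[m] for m in stamp if m != sender)

    def receive(self, sender: str, stamp: Dict[str, int], payload: str) -> None:
        self.pending.append((sender, stamp, payload))
        progress = True
        while progress:  # keep delivering until no pending message qualifies
            progress = False
            for item in list(self.pending):
                s, ts, body = item
                if self._deliverable(s, ts):
                    self.pending.remove(item)
                    self.delivered[s] += 1
                    self.log.append(body)
                    progress = True
```

If q multicasts m' after delivering p's message m, then a receiver r that happens to receive m' first holds it in `pending` until m arrives, so m is delivered before m' at r.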

This delivery ordering can be extended to a total ordering in the following sense, which expresses
the second and more restrictive delivery ordering provided by Isis: two messages sent concurrently
(i.e., that are not related causally) to the same group are delivered to all members of the group in
the same order. In terms of $\rightarrow$, this property is specified by additionally requiring that if at some
$p$, $\mathit{deliver}_p(m, G) \rightarrow \mathit{deliver}_p(m', G)$, then at all other common destinations $q$,
$\mathit{deliver}_q(m, G) \rightarrow \mathit{deliver}_q(m', G)$.³ This property is implemented by the ABCAST protocol.

²Actually, nonmembers can also multicast to a group in Isis, although in this paper we do not consider this case.
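
The total ordering requirement can be stated as a check on the delivery logs of two destinations. The following is a small illustrative sketch; the function name and the representation of logs as lists of message identifiers are assumptions, and nothing here describes how ABCAST achieves the ordering.

```python
from itertools import combinations
from typing import List


def consistent_total_order(log_p: List[str], log_q: List[str]) -> bool:
    """True if messages delivered at both p and q appear in the same relative order."""
    position_at_q = {msg: index for index, msg in enumerate(log_q)}
    # Messages common to both logs, listed in the order p delivered them.
    common = [msg for msg in log_p if msg in position_at_q]
    # Every pair ordered (a before b) at p must be ordered the same way at q.
    return all(position_at_q[a] < position_at_q[b] for a, b in combinations(common, 2))
```

Messages delivered at only one of the two processes are ignored, which matches the restriction of the property to common destinations.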

3  The  System  Model 

The  basic  system  model  for  which  Isis  is  implemented  is  a  very  benign  one.  Informally,  the  system 
consists of a set $S$ of sites which execute the set $P$ of processes.⁴ Sites and processes may fail, but
only  by  crashing  detectably,  and  if  a  site  fails  then  so  do  the  processes  residing  upon  it.  The  sites, 
and  the  processes  they  host,  communicate  via  an  asynchronous  network:  no  bounds  on  message 
transmission  delays  are  assumed. 

The  system  model  we  consider  in  this  paper  is  obtained  by  weakening  aspects  of  this  model  in 
various  ways,  namely  by  allowing  the  network  or  sites  to  be  corrupted  by  an  intruder.  In  the 
terminology  of  [VK83],  a  corrupt  network  may  suffer  certain  passive  attacks,  namely  the  release 
of  message  contents,  and  certain  active  attacks,  namely  message-stream  modification,  spurious 
association  initiation,  and  denial  of  message  service,  at  the  hands  of  an  intruder.  Release  of  message 
contents  occurs  when  an  intruder  simply  observes  intelligible  messages  passing  over  the  network 
without interfering with their flow. Message-stream modification includes transient attacks on the
authenticity, integrity and ordering of messages. Spurious association initiation includes attacks
in which an intruder replays a previously recorded association initiation sequence or attempts to
establish associations under a false identity. Lastly, denial of message service attacks are essentially
persistent message-stream modification attacks and often include discarding or delaying all messages
between two communicating endpoints. Note, however, that we do not consider one other type of
attack  enumerated  in  [VK83],  namely  traffic  analysis.  In  a  traffic  analysis  attack  an  intruder  gathers 
information  about  the  contents  of  unintelligible  messages  from  the  frequency  of  transmissions  and 
the  lengths,  sources  and  destinations  of  the  messages.  In  this  paper  we  make  no  effort  to  deal  with 
traffic  analysis  attacks. 

In addition to network corruption, there exists a set $C \subseteq S$ of corrupt sites. A corrupt site may
exhibit arbitrarily malicious behaviors, limited only by the aforementioned network assumptions
and the assumption that it is computationally infeasible for an intruder to break the cryptosystems
we employ. Intuitively, a corrupt site is one on which the operating system or Isis code or data has
been either accidentally or maliciously altered, disrupted or replaced.

³Several other reasonable definitions are possible for a total delivery order, although current plans for Horus
include implementation of this one only. Another option which may be considered in the future is if at some $p$,
$\mathit{deliver}_p(m, G) \rightarrow \mathit{deliver}_p(m', G')$, then at all other common destinations $q$,
$\mathit{deliver}_q(m, G) \rightarrow \mathit{deliver}_q(m', G')$. However, we will not concern ourselves with this in this paper.

⁴Unless otherwise stated, throughout this paper the term "process" refers to an application process, and the term
"site" refers to a workstation running an operating system and, once added, Isis.

In  this  work  we  make  two  assumptions  about  the  operating  system  running  at  each  site,  if  not 
corrupt.  First,  we  assume  that  it  authenticates  in  a  secure  fashion  the  user  identifiers  of  the 
processes it executes.⁵ Second, we assume that it provides protected, private address spaces for,
and private, authentic message passing between, both system and user processes local to the site.
This includes the protection of virtual address spaces stored on external media.⁶


4  Protecting  the  Isis  Abstractions 


Given  this  statement  of  the  system  model,  we  are  now  in  a  position  to  specify  our  goals  more 
precisely.  Intuitively,  given  a  process  group,  we  would  like  to  preserve  the  abstractions  guaranteed 
to the members of the group by Isis, as described in section 2. That is, we would like to guarantee
that a process in a group observes a correct sequence of events. This is clearly impossible, however,
if a site hosting a group member is corrupt, because that site can cause arbitrary events to be
observed in any order by any process it hosts! We thus restrict our efforts to process groups which
are hosted by sites not in $C$. That is, let $\mathit{sites}_i(G)$ be the set of sites hosting the members of
$\mathit{view}_i(G)$, and let an uncorrupt group $G$ be one such that $C \cap (\bigcup_i \mathit{sites}_i(G)) = \emptyset$. Then, our goal
is to modify Isis to guarantee that in any uncorrupt group $G$ the Isis abstractions are observed by
processes in $\bigcup_i \mathit{view}_i(G)$ with respect to group $G$. Accordingly, for the remainder of the paper,
when we speak about protecting a certain abstraction with respect to a process group, we assume
that the group is uncorrupt.⁷ In section 5 we justify this assumption by addressing the question of
how  access  to  groups  can  be  controlled. 
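
The definition of an uncorrupt group quantifies over every view the group ever passes through, not only the current one. A minimal sketch of that check follows; the function name and the use of strings for site identifiers are illustrative assumptions.

```python
from typing import AbstractSet, Iterable


def is_uncorrupt(
    site_history: Iterable[AbstractSet[str]], corrupt_sites: AbstractSet[str]
) -> bool:
    """True if no corrupt site ever hosted a member in any view of the group.

    site_history plays the role of sites_0(G), sites_1(G), ... and
    corrupt_sites plays the role of C.
    """
    # Union over all i of sites_i(G).
    ever_hosted = set().union(*site_history)
    # The group is uncorrupt exactly when this union is disjoint from C.
    return ever_hosted.isdisjoint(corrupt_sites)
```

Note that a group whose history includes a corrupt site is not uncorrupt even if that site has since left the group.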

We now consider what must be done to achieve this goal. Of course, in a single paper we could not
hope to detail every step which is required to achieve this in a toolkit as complex as Isis. Thus, we
limit our discussion to what we see as the three major obstacles, which we outline below and later
discuss in detail.

⁵This is a rather strong requirement, but the mechanisms described in this paper facilitate its implementation. For
example, if smart-card technology is available, each user and site can be treated as an Isis group and the delegation
mechanisms of section 5 can be used to authenticate the user identifiers of processes executed from remote sites, in
a fashion similar to that of DSSA [GM90]. Even without such technology, the authentication mechanisms of section
4.1 provide a secure communication channel between any two sites which can be used to facilitate the authentication
of user identifiers.

⁶If the operating system pages over the network, this requires the use of a pager which encrypts as it pages.

⁷We also assume that the Isis failure detector and name service are not corrupt and reside on uncorrupt sites.
These services will be mentioned in sections 4.1 and 4.2, respectively.

First,  due  to  the  very  fact  that  preservation  of  the  abstractions  requires  communication,  a  necessary 
step  is  to  develop  a  subsystem  which  provides  message  authentication.  In  particular,  this  subsystem 
should allow a site in a group (i.e., a site hosting a member of a group) to detect attempts by an
intruder  to  insert,  alter  or  replay  group  messages  or  to  impersonate  another  site  in  the  group.  If 
we  can  achieve  this,  a  site  in  the  group  can  rely  upon  the  legitimacy  and  authenticity  of  messages 
apparently  from  other  sites  in  the  group.  In  section  4.1  we  propose  an  authentication  subsystem 
which  accomplishes  these  goals. 

Once  we  have  it,  an  authentication  subsystem  of  this  strength  yields  additional  benefits.  Since 
altered  messages  are  detected  (and  ignored),  denial  of  message  service  becomes  indistinguishable 
from lengthy message delivery times. And, since Isis is constructed for an asynchronous environment,
Isis will behave correctly under such attacks. That is, while a network intruder can prohibit
the liveness of Isis and cause sites to be mistakenly deemed faulty by denying message service, these
attacks  will  not  result  in  the  violation  of  any  safety  properties  guaranteed  by  Isis  to  an  uncorrupt 
group.  Similarly,  attempts  by  an  intruder  to  reorder  messages  on  the  network  are  fruitless,  as  Isis 
assumes  that  the  network  can  do  this  anyway. 

The authentication subsystem therefore nullifies the active network attacks described in section 3,
as well as any attempts to impersonate sites in an uncorrupt group. Moreover, the authentication
subsystem severely limits other types of attacks an intruder can mount. For example, the majority
of the process group abstraction and the first two aspects of virtual synchrony listed in section 2 are
easily preserved, because their implementations require communication local to the group, which
by  assumption  contains  only  uncorrupt  sites. 

The  second  obstacle,  and  the  remaining  weakness  in  the  process  group  abstraction,  lies  in  the 
protocol  by  which  a  process  joins  a  group.  In  a  request  to  join  a  group  the  process  specifies 
the  group  address,  but  unless  the  process’  site  can  authenticate  the  group  specified  by  the  group 
address, then the first group view the process observes, and hence all subsequent group views, may
be  fallacious.  A  related  issue  which  must  be  addressed  is  how  a  process  can  obtain  an  authentic 
group  address  for  a  group  it  wishes  to  join.  In  section  4.2  we  address  these  issues. 

Third,  the  intruder  can  greatly  complicate  the  definition  and  preservation  of  causality  in  our  system 
model.  In  addition  to  Isis,  numerous  other  systems  have  implemented  protocols  which  guarantee 
causal delivery orderings among messages (e.g., [PBS89,LLS90]), and thus interest in preserving
causality in hostile environments is not solely among Isis users. Moreover, CBCAST (i.e., causal
multicast) is central to virtual synchrony in Isis also because ABCAST, the protocol which implements
the total ordering property, is implemented in terms of it [BSS90]. Section 4.3 is devoted to
the  problems  of  understanding  and  preserving  causality  in  our  system  model. 


4.1  Authentication 

We  introduce  authentication  mechanisms  at  the  lowest  layer  of  the  Isis  toolkit,  namely  the  Multicast 
Transport Service (MUTS) [vR91]. A copy of MUTS resides on each site, logically at the transport
layer of the ISO OSI Reference Model, and provides to the layers above it at-most-once, sequenced
communication to other sites. MUTS has the job of providing these abstractions while insulating
the higher layers from the particular transport protocol used, which may exploit hardware multicast
capability. 

For  our  purposes,  the  MUTS  layer  is  the  obvious  place  at  which  to  authenticate  messages.  Indeed, 
it  would  be  fruitless  to  authenticate  messages  only  at  higher  layers  of  the  Isis  system,  as  then 
they could not rely upon the abstractions provided by MUTS. And, other systems (e.g., DASH
[AR87]) have identified additional advantages in authenticating between sites, as opposed to at
higher levels.⁸

Before  presenting  our  authentication  mechanisms,  we  must  briefly  consider  how  MUTS  works.  The 
primary  structure  recognized  by  MUTS  is  the  group  entity.  The  group  entity  corresponding  to  the 
process  group  G  is  the  collection  of  sites  hosting  members  of  the  group,  and  accordingly  it  progresses 
through the sequence $\mathit{sites}_0(G), \mathit{sites}_1(G), \ldots$ of sets. A MUTS layer learns about changes to the
membership of a group entity from the layer above it, which communicates with other sites in
the group entity and with the Isis failure detector [BJ87,RB91] to make this determination. Each
MUTS layer thus has a current member list of each group entity it is in.⁹ When MUTS receives a
message  from  a  higher  layer  to  be  multicast  to  a  group  entity,  it  opens  a  connection  to  the  members 
of  its  current  member  list  for  the  group  entity,  if  one  does  not  already  exist.  A  connection  is 
associated  with  exactly  one  group  entity  and  is  simply  a  logical  end-to-end  data  path  from  the 
originating site to the other sites in the originator's member list. If a site is removed from the
originator's member list, it is also removed from the connection, but if a site is added to the
originator's member list, the old connection is disassembled and a new connection is negotiated for
the new member list. To send a message on a connection, MUTS breaks the message into packets,
and hands these packets to the transport protocol for transmission. Each packet carries with it
the connection number and a sequence number for the connection. Connection numbers are unique
system-wide, and the sequence numbers for a connection form an increasing sequence. When the
sequence reaches its upper bound, the connection is disassembled and a new one is negotiated.
Packet acknowledgements are managed by MUTS with a sliding-window protocol which has been
adapted for use with multicast communication.

⁸Moreover, by our assumptions of section 3 regarding the operating system and by the fact that all Isis
communication flows through MUTS, authentication between MUTS layers also prevents any impersonation whatsoever of
Isis by a malicious user process residing on an uncorrupt site.

⁹Member lists should not be confused with group views delivered to the application process. The latter constitute
the process group abstraction and are synchronized with communication as described in section 2. The former,
however, consist of sites and are not coordinated with incoming communication or other events.

Techniques  for  authenticating  messages  (or  in  this  case,  packets)  have  existed  in  the  literature 
for  many  years.  Traditionally,  these  methods  have  depended  upon  encryption,  but  methods  based 
upon pseudo-random functions are theoretically at least as attractive. Informally, a pseudo-random
function $f$ has the property that if $f$ is unknown, it is computationally infeasible to produce $f(m)$
for any $m$ with a probability of success greater than random guessing, even after having seen several
other $(m', f(m'))$ pairs. Thus, given a family of pseudo-random functions $\{f_K\}_{K \in \mathcal{K}}$ indexed by
keys from some key space $\mathcal{K}$, two parties which share a secret key $K$ can authenticate their messages
to each other by appending $f_K(m)$ to each message $m$ [Riv90a]. Moreover, they can be sure of
the  freshness  of  their  messages  if  timestamps,  nonce  identifiers,  or  sequence  numbers  are  included 
therein. 
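
The idea of appending a keyed tag to each message can be sketched with a modern keyed hash. The example below uses HMAC-SHA-256 from the Python standard library as a stand-in for the pseudo-random function; this choice is an assumption made for illustration and is not the function used in Isis. The function names are likewise hypothetical.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def tag(key: bytes, message: bytes) -> bytes:
    """Stand-in for f_K(m): a keyed tag that is infeasible to forge without the key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def authenticate(key: bytes, message: bytes) -> bytes:
    """Sender side: append the tag to the message."""
    return message + tag(key, message)


def verify(key: bytes, packet: bytes) -> Optional[bytes]:
    """Receiver side: return the message if the tag is valid, otherwise None."""
    message, received = packet[:-TAG_LEN], packet[-TAG_LEN:]
    # compare_digest avoids leaking information through comparison timing.
    if hmac.compare_digest(received, tag(key, message)):
        return message
    return None
```

This sketch provides authenticity and integrity only. Freshness would additionally require a timestamp, nonce, or sequence number inside the authenticated message.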

In Isis, we will employ an efficient approximation of a pseudo-random hash function: two candidates
are $f_K(M) = g(K, M)$ and $f_K(M) = E_K(g(M))$, where $g$ is a sufficiently strong one-way hash
function (e.g., [Riv90b]) and $E_K$ is an encryption function (e.g., [DES77]) with key $K$. Given such
an  approximation,  authentication  methods  based  upon  pseudo-random  functions  are  generally  more 
efficient than those based upon encryption. However, encryption is more useful for defending against
the release of message contents, and for some applications this is desirable. We will thus offer both
alternatives: when sending a message, an application process can request that it be encrypted
or  that  it  be  authenticated  via  a  pseudo-random  function.  For  the  rest  of  this  section,  we  discuss 
only  the  latter  option;  methods  using  encryption  are  similar. 

For MUTS we generalize the ideas presented above to take advantage of hardware multicast
capabilities that may be exploited by the transport protocol. Instead of establishing a shared key for
every pair of MUTS layers, we establish a shared key per connection, called a connection key. The
connection key is a secret held by the sites involved in the connection and is used to authenticate
messages sent on the connection. When a connection is created, the site initiating the connection
generates a fresh connection key $K$ and distributes it to the sites on its member list. Then, the
multicast $\langle P, f_K(P) \rangle$ of packet $P$ on the connection can be verified at all destinations (and a packet
encrypted under $K$ can be deciphered only at sites involved in the connection). Moreover, provided
that the connection is fresh, each destination can verify that the packet is fresh, because $P$ contains
the sequence number for the connection. Here we do not detail the protocol by which a connection
is opened, although we remark that freshness of connections is guaranteed by incorporating
timestamps¹⁰ into the appropriate connection initiation messages.
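
The interplay of the connection key and the per-connection sequence number can be sketched as follows. This is a simplified illustration and not the MUTS packet format: the header layout, the use of HMAC-SHA-256 as the tag function, and the rule that sequence numbers must strictly increase (ignoring the sliding window) are all assumptions made for the example.

```python
import hashlib
import hmac
import struct
from typing import Optional

HEADER = struct.Struct("!QQ")  # hypothetical header: connection number, sequence number
TAG_LEN = hashlib.sha256().digest_size


def seal(conn_key: bytes, conn_no: int, seq: int, payload: bytes) -> bytes:
    """Sender side: tag the header and payload together under the connection key."""
    body = HEADER.pack(conn_no, seq) + payload
    return body + hmac.new(conn_key, body, hashlib.sha256).digest()


class ConnectionEndpoint:
    """Receiver side of one connection: rejects forged, misdirected and replayed packets."""

    def __init__(self, conn_key: bytes, conn_no: int) -> None:
        self.conn_key = conn_key
        self.conn_no = conn_no
        self.last_seq = 0  # highest sequence number accepted so far

    def accept(self, packet: bytes) -> Optional[bytes]:
        if len(packet) < HEADER.size + TAG_LEN:
            return None
        body, received = packet[:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(self.conn_key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(received, expected):
            return None  # not produced by a holder of the connection key
        conn_no, seq = HEADER.unpack_from(body)
        if conn_no != self.conn_no or seq <= self.last_seq:
            return None  # wrong connection, or a replay of an old packet
        self.last_seq = seq
        return body[HEADER.size:]
```

Because the sequence number lies inside the authenticated body, an intruder can neither replay an old packet nor renumber it without invalidating the tag.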

In  order  to  authenticate  and  distribute  connection  keys,  we  employ  a  group  key.  This  is  actually  a 
private key/public key pair, possession of the private component of which is evidence of membership
in  the  group  entity.  The  group  key  for  a  group  is  created  by  the  site  hosting  the  first  member  of 
the  group,  and  as  other  processes  join,  the  group  key  is  given  to  their  sites.  Connection  keys  for  a 
group  are  thus  communicated  in  the  obvious  way,  encrypted  with  the  public  key  of  the  group  and 
signed  with  the  private  key  of  the  group. 

It must be emphasized at this point that sites hold connection keys and private keys of groups; user
processes do not. Thus, when a process leaves a group voluntarily, the site on which it resides can
destroy the group and connection keys which it held on behalf of the process. By doing so, if the
site is subsequently corrupted, the intruder will not be able to masquerade as a group member.¹¹
Similarly, if a member process crashes, again the site will destroy the keys it held on the process'
behalf. If the entire site crashes, we rely upon the loss of volatile storage to eliminate all keys from
memory. 
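
The bookkeeping this implies at a site can be sketched as a small key store indexed by process and group. The class and method names below are hypothetical, and overwriting a `bytearray` is only a best-effort illustration of key destruction: Python gives no guarantee that other copies of the key material do not remain in memory.

```python
from typing import Dict, Tuple


class SiteKeyStore:
    """Keys a site holds on behalf of its local processes (illustrative only)."""

    def __init__(self) -> None:
        self._keys: Dict[Tuple[str, str], bytearray] = {}

    def hold(self, process: str, group: str, key: bytes) -> None:
        # Store a mutable copy so it can be overwritten on release.
        self._keys[(process, group)] = bytearray(key)

    def has_key(self, process: str, group: str) -> bool:
        return (process, group) in self._keys

    def release(self, process: str, group: str) -> None:
        """Called when the process leaves the group voluntarily."""
        key = self._keys.pop((process, group), None)
        if key is not None:
            for i in range(len(key)):
                key[i] = 0  # overwrite the key material before dropping it

    def release_process(self, process: str) -> None:
        """Called when a local process crashes: drop every key held for it."""
        for owner, group in [k for k in self._keys if k[0] == process]:
            self.release(owner, group)
```

Keeping the keys in the site's store rather than in the address space of the user process is what lets the site erase them reliably when the process leaves or crashes.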

Of  course,  we  must  discuss  how  group  keys  are  distributed.  Like  all  other  key  distribution  schemes, 
in order to distribute a group key we require some form of an authentication service, i.e., an a
priori trusted authority. We choose to employ a public key authentication service due to the
security advantages which can be achieved [Dif88]. Associated with the authentication service
is a private key (known only to the service) and a corresponding public key. The public key is
given to the MUTS layer on each site, along with the site's own site key (a private key/public
key pair), when the site is booted.¹² Once the site is booted, it requests from the authentication
service its certificate, which contains the identifier and public key of the site and the expiration

^“See  appendix  A. 

^^Formally  our  assumptions  exclude  the  subsequent  corruption  of  the  site  of  a  former  group  member,  although  in 
practice  this  erasure  of  keys  is  prudent.  Our  assumptions  also  omit  the  case  in  whicli  a  site,  due  to  corruption,  does 
not  destroy  the  group  or  connection  keys  when  its  process  leaves  the  group. 

'*The  boot  procedure  appropriate  for  each  site  in  a  particular  setting  is  ilepeinlettt  on  inanv  factors,  .-nch  as  the 
physical  security  of  the  site,  whether  the  site  is  diskless,  and  the  role  of  the  Mte  in  the  s\sietn.  Thus,  a  complete 
discussion  of  this  issue  is  outside  the  scope  of  this  paper.  However,  the  boor  procedure  itsed  at  each  'ite  shoitid 
prevent  an  intruder  from  booling  the  site  with  false  operating  system  or  Isis  code  or  with  a  false  authentication 
service  public  key. 
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time  of  the  public  key,  aJl  signed  by  the  private  key  of  the  authentication  service.  Each  site's 
certificate  is  subsequently  stored  at,  and  disseminated  from,  the  site  itself.  Tliis  method  of  storing 
certificates  has  the  benefit  of  eliminating  the  need  for  a  public  key  or  certificate  repository  as  is 
used  in  many  security  architectures  (e.g..  Strongbox  [TY91]),  and  although  we  will  not  describe 
our  key  distribution  protocols  here,  it  also  does  not  in  general  increa.se  the  message  complexity  of 
our  protocols  due  to  the  particular  patterns  of  communication  seen  in  Isis.  We  emphasize  that 
no  interaction  with  the  authentication  service  is  required  to  distribute  the  private  key  of  a  group. 
Ideally  the  authentication  service  would  interact  with  a  site  only  when  it  needs  to  give  the  site 
a  fresh  certificate,  and  indeed  the  authentication  service  could  be  taken  offline  until  such  a  need 
arises. 
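To make the certificate contents concrete, the following is a minimal sketch of issuing and checking a site certificate. The paper does not fix a cryptosystem, so the sketch substitutes a toy RSA key built from two Mersenne primes; it is for illustration only and is not secure. All names (`Certificate`, `issue`, `is_valid`) and the byte encoding of the signed fields are assumptions.

```python
import hashlib
from dataclasses import dataclass

# Toy RSA key of the authentication service (illustration only, NOT secure).
_P, _Q, _E = 2**61 - 1, 2**89 - 1, 65537
_N = _P * _Q
_D = pow(_E, -1, (_P - 1) * (_Q - 1))  # private exponent, known only to the service


def _digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % _N


def _payload(site_id: str, public_key: str, expires: int) -> bytes:
    # Hypothetical encoding of the signed fields.
    return f"{site_id}|{public_key}|{expires}".encode()


@dataclass(frozen=True)
class Certificate:
    site_id: str
    site_public_key: str
    expires: int      # expiration time of the public key
    signature: int    # signature by the authentication service


def issue(site_id: str, public_key: str, expires: int) -> Certificate:
    """Run by the authentication service, which holds the private exponent."""
    sig = pow(_digest(_payload(site_id, public_key, expires)), _D, _N)
    return Certificate(site_id, public_key, expires, sig)


def is_valid(cert: Certificate, now: int) -> bool:
    """Run by any site: needs only the service's public key (_N, _E)."""
    expected = _digest(_payload(cert.site_id, cert.site_public_key, cert.expires))
    return now < cert.expires and pow(cert.signature, _E, _N) == expected
```

Because checking needs only the public key given to each site at boot time, a site can present its own certificate to its peers without contacting the authentication service, which is why the service can remain offline between renewals.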

We can generalize this scheme by allowing multiple authentication services, as originally proposed
in [BLNS86]. Intuitively, each authentication service would be responsible for generating certificates
for some subset of the sites, and each site would be given, at boot time, the public key of the
authority it should trust. This generality has consequences, however, in the sense that authentication
among sites which trust different authentication services becomes complex. One solution is
to allow several authentication services to generate certificates for the same site, and another is to
employ "higher authorities," much like the cross-certifying authorities of SPX [TA91], to vouch for
the public keys of other authentication services. The details of this have not yet been sufficiently
investigated, however, and so we will not discuss them further here. In the first implementation of
this system, we intend to use a single authentication service, and this generalization is regarded as
an enhancement for later development.


4.2 Joining Groups

As described earlier, the protocol by which a process joins a group is crucial to the process group
abstraction, because if this is not secure, an intruder may cause the process to observe fallacious
group views and thus to act incorrectly. In the current plans for Isis, the protocol for a process to
join a group runs as follows. First, the requesting process specifies the group address of the group
it wishes to join. This address contains the address of a group contact, which is a distinguished site
in the group. The process's site sends the join request to the group contact, which then formally
admits the process to the group and gives the process's site its first group entity member list.^13

In order for the join protocol to be secure, the process's site must be able to authenticate the response
from the group contact. It would appear that this is done easily via the methods of section 4.1;
the group contact signs its response with its site key and appends its certificate, thus allowing the
receiving site to verify it. A difficulty arises, however, if the group address is outdated in the sense
that the group contact contained therein has since left the group. Even though in theory our model
prohibits the former group contact from being corrupted, in practice it would be prudent for the
requesting site to verify that the supposed group contact is genuinely in the group before accepting
any group information from it. To facilitate this, we include the public key of the group in the
group address, and thus when the requesting process specifies the group address, its site can verify
that the supposed group contact is actually in the group.

^13 Moreover, if the requesting site included its certificate in the request, then the group contact could also return
the private key of the group encrypted under the site's public key.

Of course, the success of this scheme hinges on the ability of a process to obtain the authentic
address of a group it wishes to join; for the remainder of this section we address this issue. In Isis,
a process can obtain a group address in either of two ways: it can simply receive it from another
application process, or if the group is registered at the Isis name service, then the process can
request the group address from the name service by specifying the group name. The name service
is a fault tolerant Isis service which implements a hierarchical name space, like that of a file system
except with groups at the leaves instead of files. A group name is a path from the root to a leaf in
that hierarchical name space. A group member can register the group address under some name at
the name service anytime after the group is created. A group which has not been registered with
the name service is an anonymous group, and the address of an anonymous group can be obtained
only from another application process (or by creating the group).

If a process receives an address from another application process, it can trust that address only
as much as it trusts the other process. This is not to say that this scheme is worthless for secure
operations. On the contrary, if the sending process is a member of either a "trustworthy" process
group or a group which was delegated by such a group (see section 5), and proves it by having its
site exhibit knowledge of the private key of the appropriate group and by including the appropriate
credentials, then the address may be perfectly acceptable. But, to verify the claims of the sending
process, the receiving process must obtain the group addresses (i.e., the public keys) of the delegating
groups and the group of which the sending process claims to be a member. So, in many cases
verification of group addresses received from other processes eventually requires that the verifying
process be able to obtain legitimate group addresses from the name service.

Accordingly, we now consider how authentic group addresses can be obtained through the name
service. The authentication mechanisms of section 4.1 can easily be adapted to allow a site to
authenticate the name service, say by having the name service sign group addresses with its private
key and having the authentication service produce a certificate for the name service. So,
authenticating information received from the name service is not a problem.

The major impediment to the success of this scheme is the inability of the name service to authenticate
information sent to it by a process attempting to register a group. Intuitively, this is because
the name service has no reason to believe one process over another regarding the correct address of
a group, as it keeps no record of group membership. We solve this problem by allowing processes
to impose access controls upon the directories of the hierarchical name space, thus providing the
name service some means by which to discriminate between valid and invalid information. When
a process creates a directory of the name space, it specifies an access control policy for the directory
that restricts which processes, and in particular from which sites these processes, can register
a group or create a directory in that directory; in section 5, we describe a method by which this
access control policy can be specified and enforced. The name service will then allow only an
authorized process (residing on an authorized site) to register a group in the directory. Provided
that a directory allows registrations only from nonmalicious processes on uncorrupt sites, the name
service can verify the authenticity of a group address being registered in that directory simply by
authenticating the registering process's site via the methods of section 4.1.
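The registration check can be sketched as follows. This is a simplified model in which a directory's policy is just a set of (owner, site) pairs; the fuller policy based on delegation templates is described in section 5. The class name, method names and the `site_authenticated` flag (standing in for the section 4.1 authentication step) are hypothetical.

```python
class NameServiceDirectory:
    """One directory of the hierarchical name space (hypothetical model)."""

    def __init__(self, authorized):
        self.authorized = set(authorized)  # (owner, site) pairs allowed to register
        self.entries = {}                  # group name -> registered group address

    def register(self, name, address, owner, site, site_authenticated):
        """Record a group address only if the request passes both checks."""
        if not site_authenticated:
            raise PermissionError("registering site could not be authenticated")
        if (owner, site) not in self.authorized:
            raise PermissionError("registration not permitted in this directory")
        self.entries[name] = address

    def lookup(self, name):
        return self.entries[name]
```

The name service never judges whether the address itself is correct; it relies entirely on the directory's policy admitting only trustworthy registrants.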

4.3  Causal  Multicast 

As previously described, the CBCAST protocol implements the causal delivery ordering property
of virtual synchrony. It is implemented above the MUTS layer described in section 4.1, and thus its
messages can be authenticated by the methods described there. Even having limited the intruder's
ability to alter and forge messages, though, there are still difficulties in determining what causality
means, and precisely what abstraction we should try to protect, in our system model. In this
section we address these issues, but first we illustrate the role of CBCAST in the basic Isis model.

Consider first an instance of single group causality, illustrated in part (a) of figure 1. This shows
a single process group with four members $p_1$, $p_2$, $p_3$ and $p_4$, residing respectively on sites $s_1$, $s_2$,
$s_3$ and $s_4$. Time increases down the vertical lines, and an arrow ending at the vertical line below
a process indicates the delivery of the multicast represented by the arrow to the process.^14 In this
scenario, $p_4$ multicasts $m_1$ to the group, and after $s_2$ delivers it to $p_2$, $p_2$ multicasts $m_2$ to the
group. Causality requires that $m_2$ be delivered to $p_1$ after $m_1$, as indicated by the delay of message
$m_2$ until after $m_1$ in part (a) of figure 1.

^14 In Isis, we distinguish between the receipt of a message at a site and the delivery of a message to the application
process.

Figure 1: Causal Multicast. (a) Single group causality; (b) Multiple group causality; (c) Causality violated.

The more complex flavor of causality is called multiple group causality, illustrated in part (b) of
figure 1. In this situation the four processes are organized into three overlapping process groups
$G_1 = \{p_1, p_3, p_4\}$, $G_2 = \{p_2, p_3\}$ and $G_3 = \{p_1, p_2\}$. Here $p_4$ multicasts message $m_1$ to group $G_1$.
After $m_1$ is delivered to $p_3$, $p_3$ multicasts $m_2$ to $G_2$, and upon delivery of $m_2$ to $p_2$, $p_2$ multicasts
$m_3$ to $G_3$. Multiple group causality requires that $m_3$ be delivered to $p_1$ after $m_1$, as indicated in
the figure.

In our system model, the definition and preservation of causality is more complex. First, complications
arise from the fact that (processes on) corrupt sites can exhibit arbitrary communication
behavior, and not simply the group multicasts by which causality is defined in Isis. In fact, this
holds for any site in a corrupt group, because the intruder can forge group communication for
those sites. Second, even with a reasonable definition of causality that incorporates the behavior
of corrupt sites, it is not clear how we could (or if we should) respect causal obligations originating
in corrupt groups, again because communication in corrupt groups, and thus the perceived order
in which events occurred in a corrupt group, may be fallacious. Third, while communication still
occurs only through message passing, the intruder is an observer of all messages and can respond
from a corrupt site based upon the information in them. For example, in part (b) of figure 1, the
intruder could observe $m_1$ on the network and, if $s_2 \in C$, could immediately send $m_3$, based upon
information in $m_1$ and without waiting for $m_2$; now is there a causal relationship between $m_1$ and
$m_3$? Of course, we can avoid this issue by encrypting all messages, but this is a costly step to take
before even defining the notion of causality in our system model.

For the sake of brevity, we address these issues elsewhere and in this paper provide the following
guarantee: if $p_1, \ldots, p_n$ are (not necessarily distinct) processes residing on uncorrupt sites and there
exists a causal chain

$$e_1^{p_1} \rightarrow e_2^{p_2} \rightarrow \cdots \rightarrow e_n^{p_n} \qquad (1)$$

such that

• $e_1^{p_1} = \mathrm{mcast}_{p_1}(m, G)$,

• $e_n^{p_n} = \mathrm{mcast}_{p_n}(m', G')$,

• $G$ and $G'$ are uncorrupt, and

• if $e_j^{p_j} = \mathrm{mcast}_{p_j}(m'', G'')$ and $G''$ is corrupt, then $p_j = p_{j+1}$ (i.e., the causal chain is well-defined),

then $\mathrm{deliver}_q(m, G) \rightarrow \mathrm{deliver}_q(m', G')$ at all destinations $q$ of $m$ and $m'$. We argue that this
is a reasonable guarantee to provide for several reasons. First, single group causality, which is
necessary to provide the ABCAST property to a group, is the special case of this guarantee in which
$p_1, \ldots, p_n$ are members of the same group $G = G'$ and if $p_j \neq p_{j+1}$, then $e_j^{p_j} = \mathrm{mcast}_{p_j}(m'', G)$ and
$e_{j+1}^{p_{j+1}} = \mathrm{deliver}_{p_{j+1}}(m'', G)$ for some message $m''$; i.e., single group causality is the case in which the
causal chain never leaves the group $G$. Second, if a group must rely upon multiple group causality,
this guarantee allows the group to protect itself by judiciously choosing the groups with which it
shares members, because uncorrupt sites will observe incorrect orderings between multicasts only
if the causal chain which links them traverses a corrupt group.^15 Third, a stronger guarantee
would be useless to many applications, because an uncorrupt group may be contaminated at the
application level by messages with a corrupt group in their causal history, regardless of what sort
of causal guarantees are made regarding those messages.

To provide this guarantee, we first choose a secure protocol which provides single group causality
in an uncorrupt group. By the work of section 4.1, any protocol that maintains all causality
information for group communication local to the group will suffice; one such protocol is the vector
timestamp protocol for a single group described in [BSS90], which we do not discuss here. Second,
we extend this protocol to account for causal chains which exit and reenter the group. As in
the single group case, any protocol which maintains the relevant causality information only in the
groups encountered on the chain is sufficient, because we are concerned with the case in which the
chain traverses only uncorrupt groups. However, for reasons which will be outlined below, we adopt
a more cautious strategy.

^15 Of course, here we are using the term "causal chain" informally, because one which traverses a corrupt group
may not be well-defined.
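Although the vector timestamp protocol is not discussed in this paper, its delivery test can be sketched for orientation. The sketch follows the rule as it is commonly presented for single group causal multicast; the exact formulation in [BSS90] may differ, and the function names and the indexing of members by position are assumptions.

```python
def can_deliver(msg_vt, sender, local_vt):
    """Causal delivery test for a multicast stamped msg_vt by member `sender`."""
    for k, (m_k, l_k) in enumerate(zip(msg_vt, local_vt)):
        if k == sender:
            if m_k != l_k + 1:   # must be the next multicast from the sender
                return False
        elif m_k > l_k:          # sender had delivered a multicast we have not
            return False
    return True


def after_delivery(msg_vt, local_vt):
    """Local timestamp after delivering the message (componentwise maximum)."""
    return [max(a, b) for a, b in zip(msg_vt, local_vt)]
```

Applied to part (a) of figure 1 with members indexed 0 to 3: $m_1$ from $p_4$ carries [0, 0, 0, 1] and $m_2$ from $p_2$ carries [0, 1, 0, 1]. If $m_2$ reaches the site of $p_1$ first, the test fails on the last component, so $m_2$ is held until $m_1$ has been delivered.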

Intuitively, we enforce multiple group causality between $m$ and $m'$ by ensuring, before the causal
chain (1) even leaves the group $G$, that any causal obligations resulting from the multicast of $m$
will be satisfied. One protocol which does this is the conservative protocol of [BSS90], described as
follows. A multicast is stable if it has been received at all of its destination sites, and a group $G$ is
active for a process $p$ if $p$'s site does not know of the stability of a multicast to $G$ either sent by or
delivered to $p$. The conservative multicast rule states that a process $p$ may multicast to group $G$ if
and only if $G$ is the only active group for $p$ or $p$ has no active groups. If $p$ attempts to multicast
when this rule is not satisfied, the multicast is delayed, and during this delay no multicasts are
delivered to $p$. So, in part (b) of figure 1, $s_3$ simply delays sending $m_2$ until it knows that $m_1$
has been received by $s_1$. Then, the delivery algorithm at $s_1$, which specifies that two multicasts in
different groups are delivered in order of receipt, enforces the causal delivery property.
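The conservative multicast rule can be expressed as a small piece of per-process bookkeeping at a site. The following is a minimal sketch under the definitions just given; the class and method names are hypothetical, and how a site learns of stability is left outside the sketch.

```python
class ConservativeState:
    """Bookkeeping a site keeps for one local process p (hypothetical model)."""

    def __init__(self):
        # group -> ids of multicasts sent by or delivered to p, not yet known stable
        self._unstable = {}

    def note_sent_or_delivered(self, group, msg_id):
        self._unstable.setdefault(group, set()).add(msg_id)

    def note_stable(self, group, msg_id):
        pending = self._unstable.get(group)
        if pending is not None:
            pending.discard(msg_id)
            if not pending:
                del self._unstable[group]   # the group is no longer active for p

    def active_groups(self):
        return set(self._unstable)

    def may_multicast(self, group):
        """True iff `group` is the only active group for p, or p has none."""
        return self.active_groups() <= {group}
```

In part (b) of figure 1, once $m_1$ has been delivered to $p_3$ the group $G_1$ is active for $p_3$, so `may_multicast("G2")` is false and $m_2$ is delayed until the site learns that $m_1$ is stable.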

Because we are attempting to provide causal orderings defined by causal chains through only
uncorrupt groups, more efficient protocols than the conservative one could be used. We have
chosen the conservative protocol, though, due to its behavior in the face of corruption. Informally,
even if a causal chain beginning in an uncorrupt group passes through a corrupt group, the corrupt
group cannot generate a message based upon the incoming information and effect its delivery to a
member of the first group prior to the delivery of the initial multicast of the chain. Returning to
part (b) of figure 1, what we mean is that even if $s_2 \in C$, it could not manage to get $m_3$ delivered
to $p_1$ before $m_1$ as in part (c) of the figure.

Thus, while in this paper we have not formally defined the notion of a causal chain which passes
through a corrupt group, in an informal sense the conservative protocol provides an even stronger
causality guarantee than we had promised. This type of guarantee may be important to applications
in which the timing of the release of information is important. For instance, if $G_1$ represents the
trading service of section 1 and $m_1$ contains instructions to buy a large quantity of a certain stock
for a client, then after discovering the intended purchase via $m_2$, the intruder may wish to deliver
a purchase order $m_3$ for that same stock to $p_1$, before $m_1$ is delivered to $p_1$. The conservative
protocol prevents this sort of attack.

We conclude this section with mention of an additional method by which groups can protect the
causal chains on which they rely. In Isis, a causality domain [BCG91] is a set of groups among
which causality is preserved. So far we have assumed that all groups are contained in the same
causality domain, but in reality Isis supports many such domains. We are currently considering
ways to protect access to causality domains as an additional firewall against malicious intrusion.
However, this approach has not yet been sufficiently investigated, and we will not discuss it further
here.


5  Delegation  and  Access  Control 

The guarantees provided to a process group in section 4 are contingent upon the group being
uncorrupt. In light of this, it is obvious that in real systems, access to groups must be restricted.
This is also required if a programmer wishes to build a "trusted" group: a group obviously cannot
be trusted if any process may join it simply by so requesting!

As in any situation requiring access control, we have available to us two basic approaches, namely
access control lists (ACLs) and capabilities. The advantages generally cited for ACLs include that
they more naturally solve traceability, confinement and revocation problems. Capabilities, on the
other hand, better support the principle of least privilege, allow for more efficient, decentralized
transfer of access rights, and in general can be verified more efficiently than ACLs can be checked.
Several authors have argued for hybrid schemes which exploit the advantages of both approaches
(e.g., [KH84,Gon89]).

While the advantages traditionally cited for each approach also apply in our setting, we argue
that the advantages of capabilities are less applicable to our needs. First, in the majority of Isis
applications, the membership of a typical process group is relatively static. That is, group joins are
infrequent in comparison to other group operations, such as multicasts. Thus, although in general
capabilities can be verified more efficiently than ACLs can be checked, we expect that in many cases
this would have no significant effect on overall system performance. Second, a major pitfall of classic
capability systems, namely that the ability to access an object implies the ability to grant access
to the object, seems particularly hazardous in our system model. By passing capabilities to sites
outside the group, the group places trust in those external sites to not propagate the capabilities in
unintended ways. This may at first appear to be a moot point, because by granting a capability to
a corrupt site, the group has also entrusted a corrupt site to enter a process in the group. However,
if the site is subsequently suspected of being corrupt and is placed on an exception list to prevent
any process on that site from joining the group, there is still no way to prevent the corrupt site
from passing the capability to others. Indeed, once the capability has been passed to a corrupt
site, the ability to control access to the group based upon the capability is nullified, because the
capability may propagate unchecked.

For these reasons, classic capabilities are not the access control mechanism of choice in our system,
and instead we view ACLs, or possibly a hybrid scheme, as being more suitable. In the remainder of
this section, we describe an access control scheme based only upon ACLs which we plan to employ
in our system. This scheme is sufficiently powerful to be used as the sole means to control access to
groups, although it could also easily be adapted for use in a hybrid scheme such as that in [Gon89].

The straightforward criteria on which to restrict access to groups are the owner and site of the
process requesting access. That is, when a group is created, the creating process would specify
a set (i.e., ACL) of (owner, site) pairs which indicates the processes which would be allowed to
join the group. The problem with this approach is that it is not sufficiently expressive. Consider
an extension of the NYSE example of section 1 in which a client process authorizes the brokerage
service to purchase stocks with funds in the client's account at XYZ bank. After locating the stock,
the brokerage service must send a representative to a group established by XYZ bank to arrange
the fund transfer for the stock purchase. However, XYZ bank will admit the representative to this
group only if it has been legitimately authorized by a client of the XYZ bank. Thus, the simple
scheme of admitting the representative based upon its owner (say, an individual stock broker) and
hosting site is insufficient here for two reasons: this information neither convinces the bank group
that the process represents the brokerage nor conveys the authorization granted by the client.

This flavor of authorization is closely related to many concepts which have appeared in the literature
in recent years, including authentication forwarding [SNS88], cascaded authentication [Sol88], and
delegation [GM90]. Informally, each of these terms denotes the means by which one party authorizes
another to act on its behalf, as exemplified by the client delegating authority to the stock brokerage
in the previous example. The delegation problem in this example is how the brokerage representative
can convince the bank group to admit it, given that a client has legitimately delegated authority
to the brokerage service.

The delegation problem in a group oriented system is different from that in other systems only
in the sense that groups, instead of processes, are delegating and being delegated. In practical
terms, this means that groups need to be authenticated, instead of processes or sites. Fortunately,
we already have in place the mechanisms to do this, namely group keys and the name service
introduced in sections 4.1 and 4.2.

The approach we take to delegation is best illustrated by an example. Suppose that group $G_1$
wishes to delegate authority to group $G_2$. To do so it sends to $G_2$ the message

$$G_1,\ T_1,\ G_2,\ S_1(T_1, G_2) \qquad (2)$$

where "$T_1$" is the time at which this delegation expires, $S_1$ denotes the signature function of
$G_1$ (i.e., signature with the private key of $G_1$), and "$G_1$" and "$G_2$" are the names of $G_1$ and $G_2$,
respectively.^16 Intuitively, a member of $G_2$ can present (2) to another party to prove that any
member of $G_2$ can speak on behalf of $G_1$ until time $T_1$. The other party verifies this claim with the
address of $G_1$. (Recall that the address for a group now contains the group's public key.) Moreover,
a process in $G_2$ can delegate further to group $G_3$ by sending it

$$G_1,\ T_1,\ G_2,\ T_2,\ G_3,\ S_2(S_1(T_1, G_2), T_2, G_3) \qquad (3)$$

A recipient of this message should believe that a member of $G_3$ has authority until time $\min\{T_1, T_2\}$
to act on behalf of $G_1$,^17 provided that the message can be verified by the appropriate public keys.
Of course, $G_3$ could delegate yet further in a similar fashion, and in general, such delegation chains
could become arbitrarily long. This scheme has many of the same features as that in [Sol88], and
the reader is referred there for further discussion.
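The nesting of signatures in (2) and (3) can be illustrated with a small sketch. Several assumptions are made and should be read as such: toy RSA keys built from Mersenne primes stand in for group keys (illustration only, not secure); each link carries its own signature so that a hash-then-sign scheme can check every step, whereas (3) as written shows only the outermost signature; and a delegation is treated as expired once the current time reaches its expiry. All function and field names are hypothetical.

```python
import hashlib

E = 65537  # public exponent shared by every toy key


def make_key(a, b):
    """Toy RSA key from the Mersenne primes 2**a - 1 and 2**b - 1 (NOT secure)."""
    p, q = 2**a - 1, 2**b - 1
    return {"n": p * q, "d": pow(E, -1, (p - 1) * (q - 1))}


def _digest(fields, n):
    data = "|".join(str(f) for f in fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(key, *fields):
    return pow(_digest(fields, key["n"]), key["d"], key["n"])


def verify(n, signature, *fields):
    return pow(signature, E, n) == _digest(fields, n)


def delegate(chain, key, delegator, expires, delegate_to):
    """Append a link: S_1(T_1, G_2) first, then S_k(previous signature, T_k, G_{k+1})."""
    fields = (chain[-1]["sig"], expires, delegate_to) if chain else (expires, delegate_to)
    link = {"from": delegator, "expires": expires, "to": delegate_to,
            "sig": sign(key, *fields)}
    return chain + [link]


def chain_expiry(chain, moduli, now):
    """Return min{T_1, ..., T_k} if every link verifies and none has expired, else None."""
    if not chain:
        return None
    prev = None
    for link in chain:
        if prev is not None and link["from"] != prev["to"]:
            return None  # each delegator must be the previous delegate
        fields = ((link["expires"], link["to"]) if prev is None
                  else (prev["sig"], link["expires"], link["to"]))
        n = moduli.get(link["from"])  # public key taken from the group's address
        if n is None or not verify(n, link["sig"], *fields):
            return None
        prev = link
    expiry = min(link["expires"] for link in chain)
    return expiry if now < expiry else None
```

For a chain from $G_1$ to $G_2$ expiring at 100 and from $G_2$ to $G_3$ expiring at 80, `chain_expiry` reports 80 while the chain is live, matching the $\min\{T_1, T_2\}$ reading above.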

A problem with this delegation scheme is that the delegating group has no means by which to
restrict the authority it grants to the group it delegates. For instance, after delegation (2), object
monitors may allow members of $G_2$ to access any resource that $G_1$ could, including those that
$G_1$ did not intend. This is a problem common to many delegation schemes, and it can be dealt
with in several ways. In [Sol88], with each delegation is included a set of constraints which may
explicitly specify the subset of resources normally accessible to the delegating party to which the
delegated party should be granted access. It appears more difficult to develop a general access
control mechanism using constraints, though, because the form of the constraints may be too
application specific. Another approach, which is taken in [GM90] and which we adopt here, is to
limit the access rights of the delegated party through the use of roles. Associated with each role
of a group is some subset of the access rights of the group. When a group delegates the authority
of a role, the authority transferred to the delegated group is only that of the role, and not of
the delegating group. So, in the NYSE example, the client group could delegate authority to the
brokerage service under a role which was used to establish the bank account and which would be
useless, say, for reading the client's mail. Although delegation via roles is less flexible and requires
more forethought than the use of constraints, in our case the use of roles is particularly attractive,
because a role corresponds to just another group name. That is, a group can create roles for itself
by registering other names for the group with the name service.

^16 This form of delegation conceivably could admit the use of group addresses instead of names, although for the
purposes of access control we allow delegation by name only.

^17 Of course, an application may interpret (2) and (3) differently than as we have stated. For example, based upon
(3) an object monitor may grant a member of $G_3$ the access rights of $G_1$, $G_2$, or some combination thereof.

We  can  now  extend  this  delegation  mechanism  into  an  access  control  scheme  as  follows.  A  process 
specifies  access  control  policy  for  a  group  by  providing  a  set  of  delegation  templates  when  it  creates 
the  group,  in  addition  to  the  (owner,  site)  pairs  previously  described.  (Subsequently,  a  member 
of  the  group  can  change  the  access  control  policy  for  the  group  by  adding  or  removing  delegation 
templates or (owner, site) pairs, although doing so does not remove any members from the group.)
A delegation template is a list $Q_1, \ldots, Q_n$ where each $Q_i$ is a set of group names. A delegation
template specifies a set of delegation chains which are acceptable credentials for a process to join
the group. A delegation chain

$$G_1, T_1, \ldots, G_{m-1}, T_{m-1}, G_m, S_{m-1}(\ldots) \qquad (4)$$

is said to match the delegation template $Q_1, \ldots, Q_n$ if $m \geq n$ and for all $j$ satisfying $1 \leq j \leq n$,
$G_{m-n+j} \in Q_j$. That is, the chain in (4) matches the template $Q_1, \ldots, Q_n$ if the chain ends with a
sequence of delegations beginning with an element of $Q_1$, followed by an element of $Q_2$, and so on,
and ending with an element of $Q_n$.
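
The matching rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Isis implementation: the function name `matches_template`, the representation of a chain as a plain list of group names (expiration times and signatures omitted), and the example group names are all hypothetical.

```python
from typing import List, Set


def matches_template(chain_groups: List[str], template: List[Set[str]]) -> bool:
    """Check whether a delegation chain matches a delegation template.

    chain_groups holds the group names G_1, ..., G_m of the chain in order;
    template holds the sets Q_1, ..., Q_n.  The chain matches when m >= n and
    the last n groups lie, position by position, in Q_1, ..., Q_n.
    """
    m, n = len(chain_groups), len(template)
    # Assumption of this sketch: an empty template matches nothing, so that a
    # missing policy never grants access by accident.
    if n == 0 or m < n:
        return False
    tail = chain_groups[m - n:]  # G_{m-n+1}, ..., G_m
    return all(group in allowed for group, allowed in zip(tail, template))


# Hypothetical group names, for illustration only.
template = [{"/bank/clients"}, {"/brokers/acme", "/brokers/zeta"}]
print(matches_template(["/users/alice", "/bank/clients", "/brokers/acme"], template))
print(matches_template(["/bank/clients", "/users/alice"], template))
```

The first call matches because the chain ends with a member of the first set followed by a member of the second; the second call does not, because its final group is not in the last set. Only the tail of the chain is inspected, so earlier delegations in a longer chain do not affect the result.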

Given a set of delegation templates, access to a group is controlled as follows. Suppose the group
contact receives message (4) embedded in a request from some process p to join the group. Then,
p is allowed to join if and only if

1. the authenticity of message (4) can be verified with the appropriate public keys,

2. p's site can vouch that p is in $G_m$ (by illustrating knowledge of the private key for $G_m$),

3. message (4) matches a delegation template for the group,

4. none of the delegations in message (4) have expired, and

5. the (owner, site) pair of p is listed in the set of (owner, site) pairs for the group.

In this way, the sets of delegation templates and (owner, site) pairs together constitute an ACL for
the  group. 
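
The five conditions above can be sketched as a single predicate evaluated by the group contact. This is a minimal sketch under stated assumptions: the names `JoinRequest`, `GroupACL`, `chain_matches` and `may_join` are hypothetical, and the two cryptographic checks are passed in as callables because the paper does not fix particular protocols or cryptosystems. Clock skew is ignored here for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

OwnerSite = Tuple[str, str]


@dataclass
class JoinRequest:
    groups: List[str]         # G_1, ..., G_m from message (4)
    expirations: List[float]  # T_1, ..., T_{m-1}
    owner_site: OwnerSite     # (owner, site) pair of the requesting process p


@dataclass
class GroupACL:
    templates: List[List[Set[str]]]  # each template is a list Q_1, ..., Q_n
    owner_sites: Set[OwnerSite]


def chain_matches(groups: List[str], template: List[Set[str]]) -> bool:
    """True if the last n groups of the chain lie in Q_1, ..., Q_n in order."""
    n = len(template)
    if n == 0 or len(groups) < n:  # assumption: empty templates match nothing
        return False
    return all(g in q for g, q in zip(groups[-n:], template))


def may_join(
    req: JoinRequest,
    acl: GroupACL,
    now: float,
    signatures_verify: Callable[[JoinRequest], bool],  # stands in for condition 1
    site_vouches: Callable[[JoinRequest], bool],       # stands in for condition 2
) -> bool:
    return bool(
        signatures_verify(req)                                         # 1
        and site_vouches(req)                                          # 2
        and any(chain_matches(req.groups, t) for t in acl.templates)   # 3
        and all(now <= t for t in req.expirations)                     # 4
        and req.owner_site in acl.owner_sites                          # 5
    )
```

Because the conditions are joined by `and`, a request is rejected as soon as any one of them fails, which reflects the "if and only if" in the rule.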
Finally, we note that this access control mechanism can be extended to objects other than groups.
We have already seen one need for this, namely the directories of the hierarchical name space
implemented by the name service described in section 4.2. As in many file systems, a name service
directory has three natural types of access to it, namely search, read and write. So, as when creating
a  group,  a  process  can  specify  when  creating  a  directory  in  the  name  service  a  set  of  delegation 
templates  and  a  set  of  (owner,  site)  pairs  for  each  of  these  types  of  access. 
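
One way to picture the per-access-type policy of a directory is as a mapping from access type to a pair of sets. The sketch below is an assumption-laden illustration: the names `Access`, `Policy`, `DENY_ALL` and `policy_for`, the default-deny behavior for an unspecified access type, and the example group names and (owner, site) pairs are all hypothetical.

```python
from enum import Enum
from typing import Dict, List, NamedTuple, Set, Tuple


class Access(Enum):
    SEARCH = "search"
    READ = "read"
    WRITE = "write"


class Policy(NamedTuple):
    templates: List[List[Set[str]]]    # delegation templates for this access type
    owner_sites: Set[Tuple[str, str]]  # permitted (owner, site) pairs


# Assumption: an access type with no stated policy is denied to everyone.
DENY_ALL = Policy(templates=[], owner_sites=set())


def policy_for(directory_acl: Dict[Access, Policy], access: Access) -> Policy:
    """Look up the policy that governs one type of access to a directory."""
    return directory_acl.get(access, DENY_ALL)


# Hypothetical directory: chains ending in /staff or /admins may read,
# while only chains ending in /admins may write.
directory_acl = {
    Access.READ: Policy([[{"/staff", "/admins"}]],
                        {("alice", "site-a"), ("bob", "site-b")}),
    Access.WRITE: Policy([[{"/admins"}]], {("alice", "site-a")}),
}
```

The policy returned for a given access type would then be evaluated with the same checks used when a process asks to join a group.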


6  Conclusion  and  Future  Work 

In  this  paper  we  have  described  a  security  architecture  for  use  in  the  Isis  toolkit,  but  structured  in 
such  a  way  that  most  mechanisms  should  also  be  useful  in  other  group  oriented  settings.  The  major 
features  of  the  security  architecture  include  a  group  oriented  authentication  subsystem,  a  secure 
method for joining groups, and protocols which protect certain causal delivery ordering guarantees.
In  addition,  we  have  proposed  an  access  control  scheme  based  upon  delegation  for  use  in  group 
oriented  settings. 

Future  work  on  this  system  is  heading  in  several  directions.  First,  the  system  is  currently  being 
implemented  at  Cornell  University.  This  implementation  should  provide  valuable  insight  into  the 
efficiency  of  our  architecture  and  mechanisms.  It  is  also  forcing  us  to  consider  user  interface  issues 
—  while  such  constructs  as  delegation  templates  will  certainly  force  the  Isis  interface  to  change,  we 
would like to ensure that currently existing Isis applications can benefit from the new mechanisms
with minimal changes. Second, in addition to the extensions proposed in the previous sections,
we  are  pursuing  other  improvements  to  the  basic  architecture.  For  example,  we  are  currently 
investigating  ways  to  incorporate  information  flow  controls  into  Isis  and  to  relate  this  work  to 
the  TCSEC  taxonomy  [DoD83],  which  is  a  set  of  criteria,  covering  issues  such  as  access  control, 
information  flow,  and  covert  channels,  for  classifying  systems  according  to  the  levels  of  security 
assurance they provide. Third, we are also considering methods of exploiting the Isis abstractions,
once  secured,  to  enhance  the  overall  security  of  applications. 


A Synchronized Clocks

In section 4.1, we described a site's certificate as containing "the expiration time of the public key"
of the site. Presumably, a recipient of a certificate for some site should be able to determine from
the expiration time whether the certificate is fresh. Moreover, this concept of time has other applications
in our system; e.g., in section 4.1 we also use timestamps to ensure the freshness of connections,
and in section 5, timestamps are used to expire delegations.

In this paper, the source of time available to each uncorrupt site is assumed to be a clock which
is synchronized with the clocks on all other uncorrupt sites to within $\epsilon$ time units, where for our
purposes $\epsilon$ is quite large relative to that required in most other applications using synchronized
clocks; e.g., an $\epsilon$ of a few seconds may suffice. While synchronized clocks are not necessary to
determine the freshness of communication, it was recognized early in the literature that synchronized
clocks can reduce communication in authentication protocols [DS81]. Accordingly, many security
architectures, such as Kerberos [SNS88] and DSSA [GGKL89], have employed synchronized clocks
for precisely this purpose.
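
To make the role of the synchronization bound concrete, the following sketch shows one way a recipient might test timestamps when clocks agree only to within a bound. It is an illustration under assumptions, not a description of the Isis mechanisms: the function names, the `lifetime` parameter, and the choice to widen the acceptance window by the bound (rather than narrow it) are all hypothetical.

```python
def is_fresh(timestamp: float, now: float, lifetime: float, epsilon: float) -> bool:
    """Accept a timestamped message whose apparent age is plausible.

    Because the sender's clock may differ from ours by up to epsilon, a fresh
    message can appear up to epsilon in the future or up to epsilon older than
    it really is.
    """
    age = now - timestamp
    return -epsilon <= age <= lifetime + epsilon


def not_expired(expiration: float, now: float, epsilon: float) -> bool:
    """Treat an expiration time as still valid until it has passed on every
    clock that may be up to epsilon behind ours (a lenient choice)."""
    return now <= expiration + epsilon
```

With a lifetime of 5 and a bound of 2, a message stamped 100 is accepted at local time 103 and rejected at local time 108. A stricter policy would instead subtract the bound, trading occasional false rejections for a smaller replay window.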

However, it is important that clocks be synchronized securely if they are to be used to protect
against attacks in a secure system. More precisely, the clock synchronization algorithms must be
tolerant of attacks similar to those which the system must tolerate, because if the clocks can be
successfully altered, then, e.g., uncorrupt sites can become vulnerable to classic replay [DS81] and
suppress-replay attacks [GonOl]. Since most operating systems do not provide such a secure source
of time, we are forced to implement our own synchronized clocks or to find a suitable alternative.

Fortunately, a clock synchronization algorithm based upon the work in [Cri89] is currently being
implemented for use in a real-time extension of the Isis toolkit. The algorithm employs a fault
tolerant master clock, around which a set of slave clocks (i.e., sites) synchronize periodically by
requesting the master's clock value, measuring the round-trip response time from the master, and
approximating the master’s value based upon the response time and the value in the message.
While detailed discussion of the clock synchronization algorithm is outside the scope of this
document, we note only that the master clock will employ MUTS for communication and thus will
be amenable to authentication via the mechanisms described in section 4.1. Therefore, we expect
that securely synchronized clocks which satisfy our relatively modest requirements can be achieved
easily, provided that the master clock itself is not corrupt.
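
The request-and-measure step described above can be sketched as follows. The midpoint rule used here (master value plus half the round trip) is a common approximation and is an assumption of this sketch; the text does not state the exact formula used. The function name, the `request_master_clock` callable, and the omission of authentication and retries are likewise assumptions.

```python
import time
from typing import Callable, Tuple


def estimate_master_time(
    request_master_clock: Callable[[], float],
    local_clock: Callable[[], float] = time.monotonic,
) -> Tuple[float, float]:
    """Estimate the master's current clock value from one request.

    Returns (estimate, max_error).  The reply is assumed to have been
    generated somewhere within the measured round trip, so the estimate is
    off by at most half the round-trip time (ignoring clock drift).
    """
    sent = local_clock()
    master_value = request_master_clock()  # hypothetical authenticated request
    received = local_clock()
    round_trip = received - sent
    return master_value + round_trip / 2.0, round_trip / 2.0
```

For example, if the master replies 50.0 and the round trip takes 0.5 time units, the estimate is 50.25 with an error of at most 0.25. A slave could discard measurements whose round trip is too long to meet its required precision and try again.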